A Comparative Study of Data Clustering Techniques

Author

  • Francisco Azuaje
Abstract

Data clustering is the process of grouping similar data. A clustering algorithm partitions a data set into several groups such that the similarity within a group is greater than the similarity between groups. This paper reviews four of the most representative off-line clustering techniques: K-means clustering, Fuzzy C-means clustering, Mountain clustering, and Subtractive clustering. The techniques are implemented and tested on a medical problem, heart disease diagnosis. The performance and accuracy of the four techniques are presented and compared.
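As a rough illustration of the first technique named in the abstract, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not the paper's implementation: the function name `kmeans`, the `init` parameter, the squared Euclidean distance, and the toy data are all assumptions made for this example.

```python
def kmeans(points, init, iters=100):
    """Plain K-means on a list of equal-length tuples.

    `init` is a list of starting centers supplied by the caller
    (an assumption of this sketch; other initialisations are common).
    """
    centers = [tuple(c) for c in init]
    k = len(centers)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Toy data, made up for illustration: two well-separated groups in 2-D.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
        (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, labels = kmeans(data, init=[data[0], data[3]])
print(labels)
```

On this toy data the first three points end up in one group and the last three in the other, which matches the abstract's description: similarity within a group is greater than similarity between groups.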


Similar resources

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of the MTS analysis process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


A Comparative Study of Some Clustering Algorithms on Shape Data

Recently, some statistical studies have been done using the shape data. One of these studies is clustering shape data, which is the main topic of this paper. We are going to study some clustering algorithms on shape data and then introduce the best algorithm based on accuracy, speed, and scalability criteria. In addition, we propose a method for representing the shape data that facilitates and ...


Extraction of Respiratory Signal Based on Image Clustering and Intensity Parameters at Radiotherapy with External Beam: A Comparative Study

Background: Since tumors located in the thoracic region of the body move mainly due to respiration, modern radiotherapy has seen many attempts, such as external markers, strain gauges, and spirometers, to monitor patients’ breathing signals. With the advent of the fluoroscopy technique, indirect methods were proposed as an alternative approach to extract patients’ breathing signals...


Improving Vehicular Ad-Hoc Network Stability Using Meta-Heuristic Algorithms

Vehicular ad-hoc network (VANET) is an important component of intelligent transportation systems, in which vehicles are equipped with on-board computing and communication devices which enable vehicle-to-vehicle communication. Consequently, with regard to larger communication due to the greater number of vehicles, stability of connectivity would be a challenging problem. Clustering technique as ...


A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...


Hierarchical Data Clustering Model for Analyzing Passengers’ Trip in Highways

One of the most important issues in urban planning is developing sustainable public transportation. The basic condition for this purpose is analyzing the current situation, especially on the basis of data. Data mining is a set of new techniques that go beyond statistical data analysis. Clustering techniques are a subset of it, one of which is used for analyzing passengers’ trips. The result of...




Publication date: 2004